Text Classification Using WordNet Hypernyms
Abstract
This paper describes experiments in Machine Learning for text classification using a new representation of text based on WordNet hypernyms. Six binary classification tasks of varying difficulty are defined, and the Ripper system is used to produce discrimination rules for each task using the new hypernym density representation. Rules are also produced with the commonly used bag-of-words representation, incorporating no knowledge from WordNet. Experiments show that for some of the more difficult tasks the hypernym density representation leads to significantly more accurate and more comprehensible rules.
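The contrast between the two representations can be illustrated with a minimal sketch. It is not the paper's implementation: the `TOY_HYPERNYMS` table is a hypothetical stand-in for WordNet lookups, and both the `height` cutoff and the normalization by document length are assumptions made here for illustration.

```python
from collections import Counter

# Hypothetical toy hypernym table standing in for WordNet lookups.
# Each word maps to its chain of hypernyms, nearest first.
TOY_HYPERNYMS = {
    "dog": ["canine", "carnivore", "animal"],
    "cat": ["feline", "carnivore", "animal"],
    "oak": ["tree", "plant"],
}


def bag_of_words(tokens):
    """Baseline representation: raw counts of the words themselves."""
    return Counter(tokens)


def hypernym_density(tokens, height=2):
    """Count each word's hypernyms up to `height` levels, then divide by
    the number of tokens so that documents of different lengths are
    comparable. Both choices are assumptions of this sketch."""
    counts = Counter()
    for token in tokens:
        for hypernym in TOY_HYPERNYMS.get(token, [])[:height]:
            counts[hypernym] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {hypernym: count / total for hypernym, count in counts.items()}


doc = "the dog chased the cat".split()
print(bag_of_words(doc))
print(hypernym_density(doc))
```

In this toy document, "dog" and "cat" are unrelated features in the bag-of-words view, but both contribute to the shared feature "carnivore" in the hypernym view, which is the kind of generalization a rule learner could exploit.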
Similar resources
Evaluating WordNet Features in Text Classification Models
Incorporating semantic features from the WordNet lexical database is one of the many approaches that have been tried to improve the predictive performance of text classification models. The intuition behind this is that keywords in the training set alone may not be extensive enough to enable generation of a universal model for a category, but if we incorporate the word relationships in Wo...
COMPARISON OF THE EFFECTS OF LEXICAL AND ONTOLOGICAL INFORMATION ON TEXT CATEGORIZATION by CESAR KOIRALA
(Under the Direction of Khaled Rasheed) This thesis compares the effectiveness of using lexical and ontological information for text categorization. Lexical information has been induced using stemmed features. Ontological information, on the other hand, has been induced in the form of WordNet hypernyms. Text representations based on stemming and ...
Lexical Inference Mechanisms for Text Understanding and Classification
This paper describes a framework for building story traces (compact global views of a narrative) and story projections (selections of key elements of a narrative) and their applications in text understanding and classification. Word and sense properties are extracted using the WordNet lexical database enhanced with Prolog inference rules and a number of lexical transformations. Inference rules ...
Query Refinement and User Relevance Feedback for Contextualized Image Retrieval
The motivation of this paper is to increase the user-perceived precision of the results of Content Based Information Retrieval (CBIR) systems with Query Refinement (QR), Visual Analysis (VA) and Relevance Feedback (RF) algorithms. The proposed algorithms were implemented as modules in the K-Space CBIR system. The QR module discovers hypernyms for the given query from a free text corpus (Wikipedia) an...
Using WordNet Hypernyms and Dependency Features for Phrasal-Level Event Recognition and Type Classification
The goal of this research is to devise a more effective method for recognizing and classifying TimeML events. TimeML is the most recent annotation scheme for processing event and temporal expressions in natural language processing. In this paper, we argue and demonstrate that unit feature dependency information and deep-level WordNet hypernyms are useful for event recognitio...
Publication date: 1998